Skip to content

Conversation

@Anush008
Copy link
Contributor

@Anush008 Anush008 commented Mar 24, 2025

Descrption

This PR adds support to export data to a Qdrant - https://qdrant.tech/ collection.

To run Qdrant,

docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

6333 - REST API, 6334 - gRPC API.
The Rust client calls the gRPC API.

You can find a dashboard at https://localhost:6333/dashboard.

Copy link
Member

@badmonster0 badmonster0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding Qdrant storage support!

@Anush008 Anush008 marked this pull request as ready for review March 27, 2025 07:16
@Anush008
Copy link
Contributor Author

Anush008 commented Mar 27, 2025

Payload conversion is simpler now.
Setup status check is WIP.

@Anush008
Copy link
Contributor Author

Hey @badmonster0.

The fields specified under primary_key_fields are only sent as key fields and the rest as value fields. Correct?
So in case of Qdrant, we try to parse the key fields as UUIDs or Integers to be used as point IDs?

@badmonster0
Copy link
Member

Hey @badmonster0.

The fields specified under primary_key_fields are only sent as key fields and the rest as value fields. Correct? So in case of Qdrant, we try to parse the key fields as UUIDs or Integers to be used as point IDs?

Correct. "key fields" and "value fields" are exclusive.

Copy link
Member

@badmonster0 badmonster0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a warm checkin to see if there's any questions during implementing. Feel free to ask if there's any :)

It looks like the executor logic is almost there. The setup status checker is usually harder. We're also considering making this part optional (e.g. users can manage the backend resources themselves outside cocoindex, as long as the schema they created matches certain expectations). If there're indeed difficulty in this part and you prefer, I can make this change.

Thanks!

@Anush008
Copy link
Contributor Author

Anush008 commented Apr 4, 2025

Hey. I'm on a vacation.

I can make this change

Sure. Thank you 🙏

@badmonster0
Copy link
Member

Hey. I'm on a vacation.

I can make this change

Sure. Thank you 🙏

Added a setup_by_user option for this purpose: https://cocoindex.io/docs/core/flow_def#export

Then when implementing check_setup_status you can directly return an error (also just implemented Infallible for ResourceSetupStatusCheck which provides a concrete type that you can return)

Hope this makes the integration of Qdrant easier by separating into different steps :)

@Anush008
Copy link
Contributor Author

Anush008 commented Apr 8, 2025

ResourceSetupStatusCheck is handled now.
With 4a0bdf5 and a4c6cae

@Anush008
Copy link
Contributor Author

Anush008 commented Apr 8, 2025

Hey @badmonster0! Am I right in my understanding that primary_key_fields are passed as key_fields? If so, would it be sufficient to treat the first key field as the ID when using Qdrant?

67f1e71

@badmonster0
Copy link
Member

Hey @badmonster0! Am I right in my understanding that primary_key_fields are passed as key_fields?

Yes, that's correct.

If so, would it be sufficient to treat the first key field as the ID when using Qdrant?

Yes, this is the right approach in my mind.
primary_key_fields is allowed to have multiple elements as storages like Postgres support primary key on multiple fields.
Since qdrant only support a single key part, the size should always be 1. And we can raise an error (like api_bail!) when StorageFactoryBase::build is called.

@Anush008 Anush008 force-pushed the qdrant branch 2 times, most recently from 636ac3d to 9ce4328 Compare April 13, 2025 10:37
Signed-off-by: Anush008 <[email protected]>
@Anush008
Copy link
Contributor Author

Anush008 commented Apr 13, 2025

Hi @badmonster0 👋. This should be it.

You can now try the example at https://github.com/Anush008/cocoindex/tree/qdrant/examples/text_embedding_qdrant.

…::install_default() before this point

Signed-off-by: Anush008 <[email protected]>
                | BasicValueType::LocalDateTime
                | BasicValueType::OffsetDateTime
                | BasicValueType::Time
                | BasicValueType::Uuid

Signed-off-by: Anush008 <[email protected]>
Copy link
Member

@badmonster0 badmonster0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thank you very much!

@badmonster0 badmonster0 merged commit e5ada33 into cocoindex-io:main Apr 14, 2025
5 checks passed
@Anush008 Anush008 deleted the qdrant branch April 14, 2025 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants